ABSTRACT


This  is  the  Final  Report  in  a twelve-year  effort  to  model  stochastic 
phenomena  and  develop  decision-making  techniques  under  risk  and  uncertainty. 
Recent  research  areas  which  received  major  emphasis  were: 

(1) Basic risk decision models, with emphasis on determining the
structure of optimal policies in the face of unknown parameters
in the relevant risk distributions;

(2) Data collection and parameter estimation, with emphasis on
linearized Bayesian methods.

I. INTRODUCTION

During  the  past  twelve  years  under  this  contract,  "Planning  and  Control 
Under  Risk,"  (DAAG  29-77-0-0040  and  predecessor  contracts),  faculty  and 
students  at  the  Operations  Research  Center,  University  of  California,  Berkeley, 
have  tackled  a variety  of  planning  and  control  problems,  with  increasing 
emphasis  on  decision  problems  under  risk;  a list  of  publications  since  1965 
is  in  the  Appendix.  In  this,  our  final  report,  we  summarize  the  work  that 
was  completed  during  the  final  period  of  this  grant,  November  1,  1976- 
November  1977. 

The  motivation  for  this  program  of  research  was  that  almost  all  realistic 
decision  problems  in  military  and  industrial  operations  contain  elements  of 
risk  or  uncertainty.  These  unknown  elements  limit  the  information  we  can 
gather about the current state of the system, and make future states unknowable,
except  in  a statistical  sense,  thus  strongly  influencing  attitudes  towards 
the  decisions  to  be  made.  In  other  words,  the  utility  of  any  given  decision 
can only be forecast in a probabilistic manner, and sometimes cannot be evaluated
retrospectively without residual uncertainty. Thus, planning and control
strategies  must  always  reflect  these  risks  and  uncertainties. 

There  are  a variety  of  models  and  methods  which  appear  under  the  heading 
of  risk  theory;  generally  they  share  the  following  characteristics: 

(1) There is usually a probabilistic law of motion governing the
system. In simple cases, this may be a binary gamble, increasing
or decreasing the wealth available for future risk taking;
in more complicated models, a Markov-renewal or diffusion law
may change the state of the system over time.

(2)  Transitions  usually  create  costs  or  generate  profits,  and 
these,  too,  may  be  uncertain. 


(3) There is usually a concern about process boundary conditions,
which can lead to ruin (financial, insurance, gambling models),
to catastrophes (design of dams and nuclear reactors, responses
to fires and epidemics, etc.), or simply to the termination
of the game (reliability and human- or corporate-lifetime models).
Sometimes the ability to deliberately terminate a (losing) game
is the feature of primary concern, as in optimal stopping rules.
Conversely, in some stochastic optimization problems, there may be
some mathematical embarrassment associated with continuing "forever,"
and turnpike theorems, discounting, or absorbing states may have to
be invoked to avoid analytic difficulties.

(4) Expected total return or average rate of return may not be a satisfactory
system objective; for instance, the decision-maker may be
just as concerned about the fluctuations of reward under a certain
policy as about the average reward. This may lead to the use of
utility theory as an axiomatic way of specifying the decision-maker's
risk aversion, to more explicit multiple-objective formulations, or
to nature-as-the-opponent minimax functionals. Many interesting
risk-sharing problems require multiple-player Pareto-optimality approaches.
Problems with long horizons usually require specific recognition of
the utility of time.

(5) Finally, there are always important data measurement and parameter
estimation problems associated with decision under risk. Not only
may the observations themselves be subject to error, but there are
usually many more parameters to estimate for stochastic problems
than for deterministic ones. An important part of many dynamic decision
models is the provision for continuing updating of the estimators.
There may or may not be prior information upon which to base
initial decisions. The mechanics of combining estimates and data
are often laborious in a real problem, and the costs of updating
information must often be included in setting up a model.

We feel strongly that this can continue to be a fruitful research area.
In fact, a heightened sensitivity to the problems of risk within our industries
and government agencies has been demonstrated in recent years, as new problems
of the environment, concerns over worker and population safety, resource and
energy shortages, etc. have been encountered. Economic planning must now
take  into  account  possible  sudden  shifts  in  available  resources,  the  uncertainty 
of  long-term  plans,  and  the  financial  interdependence  of  sectors  of  our  society. 
We  no  longer  have  the  luxury  of  a certain  world,  but  must  develop  decision 
options  for  a variety  of  uncertain  alternate  scenarios. 

In closing this long and productive relationship with the Army Research
Office,  the  principal  investigators  would  like  to  thank  Army  personnel  for  their 
continued  support  and  encouragement.  The  educational  benefits  derived  from 
this  effort  were  immeasurable,  and  it  is  hoped  that  the  research  output  was 
ultimately  useful. 


II.  AREAS  OF  RESEARCH  (1976-1977) 


A.  Models  of  Decision  Under  Risk 

1. Renewal Decision Models

Consider a system that must operate for a fixed length of time, and
suppose that a certain component is essential for the system to be
operative. When this component fails it must be replaced. However,
there are $n$ different types of this component that can be used, the
$i$th type costing the amount $C_i$ and functioning for an exponentially
distributed length of time with rate $\lambda_i$. Thus, typically, the
more expensive types will last longer. This model was considered in [1],
where the main problem was to determine the optimal strategy for
assigning replacements for the failed components as a function of the
remaining lifetime of the system. It was shown that the optimal policy
has a particularly nice structure; namely, the time axis can be divided
into $n$ intervals such that, when a replacement has to be made, it is
optimal to select a spare from the category having the $i$th largest
value of $\lambda C$ whenever the remaining time falls into the $i$th
closest interval to the origin. This special structure was then
exploited to obtain an efficient algorithm for the determination of the
critical switch points.

Other variations of the above model were also considered in [1]. In
particular, the case of only two types of components is considered when
there is only a single spare of one of these types and an infinite
surplus of the other. Also, the optimal policy is completely determined
when there is only a finite number of certain of the types, under the
assumption that a rebate is given for the component in use at the end
of the system's life.
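
The interval structure described above can be explored numerically. The following is only a rough sketch under stated assumptions, not the algorithm of [1]: time is discretized with a step size chosen here, there is an unlimited supply of every type, no rebate is given, and the costs and failure rates are hypothetical. The function name `renewal_policy` is likewise illustrative.

```python
import math

def renewal_policy(costs, rates, horizon, step=0.01):
    """Discretized dynamic program for choosing a replacement type.

    W[k] approximates the minimal expected cost of all purchases still to
    be made when a component must be bought with k*step time remaining.
    policy[k] is the index of the type bought at that moment.
    """
    n_steps = int(round(horizon / step))
    survive = [math.exp(-lam * step) for lam in rates]  # P(no failure in one step)
    cheapest = costs.index(min(costs))
    W = [costs[cheapest]]      # limit of vanishing remaining time: one cheapest unit
    policy = [cheapest]
    A = [0.0] * len(costs)     # A[i]: expected cost of later purchases if type i is in use
    for _ in range(n_steps):
        for i, a in enumerate(survive):
            # Either the unit fails during the newest step (a purchase with
            # W[-1] to go) or it survives and the older history applies.
            A[i] = (1.0 - a) * W[-1] + a * A[i]
        totals = [c + A[i] for i, c in enumerate(costs)]
        best = min(range(len(costs)), key=totals.__getitem__)
        W.append(totals[best])
        policy.append(best)
    return W, policy

# Hypothetical data: dearer types fail less often; lambda*C = 2.0, 1.5, 1.2.
costs = [1.0, 3.0, 6.0]
rates = [2.0, 0.5, 0.2]
W, policy = renewal_policy(costs, rates, horizon=40.0)
switch_points = [(k * 0.01, policy[k]) for k in range(1, len(policy))
                 if policy[k] != policy[k - 1]]
print(switch_points)
```

With these numbers the cheapest type (largest $\lambda C$) is chosen when little time remains, and the type with the smallest $\lambda C$ is chosen when the remaining time is long; the printed list gives the approximate switch points on the grid.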

2. Gambling Models

In order to obtain some insight into the structure of optimal policies
in risk models, a class of gambling models, useful as simple prototypes
for risk models, was considered in [2]. For a variety of objectives, it
was shown that if the game is favorable to the player, then he should
play as timidly as possible; that is, always make the smallest bet.
A model in which the gambler is also given the option of working is
considered, and it is shown that if the available gambles are
unfavorable then the strategy which minimizes the gambler's expected
time to reach some preassigned goal is the strategy that always calls
for working. For the same model it is also shown that if the work
option is only available at certain times (namely, when the gambler is
broke) then the optimal gambling strategy is to play boldly. These
results were obtained by developing some new general results in dynamic
programming, also given in [2].
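
The timid-play result can be checked numerically for one objective. The sketch below assumes a simple setting that is not spelled out in the text: integer fortunes, even-money bets won with probability p > 1/2, and the objective of reaching a fixed goal before ruin. It compares the classical gambler's-ruin formula for unit bets with the value obtained by value iteration over all admissible bets; the function names are illustrative.

```python
def timid_win_probability(x, goal, p):
    """Probability of reaching `goal` from fortune x when always betting 1."""
    if p == 0.5:
        return x / goal
    r = (1.0 - p) / p
    return (1.0 - r ** x) / (1.0 - r ** goal)

def optimal_win_probability(goal, p, sweeps=5000):
    """Value iteration over all integer bets b = 1, ..., min(x, goal - x)."""
    V = [0.0] * (goal + 1)
    V[goal] = 1.0
    for _ in range(sweeps):
        new = V[:]
        for x in range(1, goal):
            new[x] = max(p * V[x + b] + (1.0 - p) * V[x - b]
                         for b in range(1, min(x, goal - x) + 1))
        V = new
    return V

goal, p = 10, 0.6          # hypothetical favorable game
V = optimal_win_probability(goal, p)
gap = max(abs(V[x] - timid_win_probability(x, goal, p)) for x in range(1, goal))
print(gap)                 # numerically negligible: unit bets attain the optimum
```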

The study of these models was continued in the Ph.D. thesis [3] of
E. Subelman, a student of Professor Ross. This thesis considered the
problem where a decision-maker is allowed to gamble, with his objective
being to maximize the (expected) utility of his final fortune. It is
supposed that if he bets an amount $y$ then his return from this gamble
will be $yR$, where $R$ is a random variable whose distribution is
known to the gambler, the classical case being when

$$R = \begin{cases} 2 & \text{with probability } p \\ 0 & \text{with probability } 1 - p . \end{cases}$$

The objective of this research is to determine properties of $y_n(x)$,
the optimal bet when the gambler's fortune is $x$ and there are $n$
gambles to go, under the assumption of an (arbitrary) concave utility
function. Some of the results obtained were that, in the classical
case, the optimal amount to save and to strive for are both increasing
in the gambler's fortune. That is, in the classical case, both

$$x - y_n(x) \quad \text{and} \quad x + y_n(x)$$

are increasing functions of $x$. In addition it is shown that, as a
function of $p$, $y_n(x)$ is nondecreasing. Thus the more favorable the
game the more one should bet. These results were then generalized to
the adaptive situation in which $p$ is unknown and information about
$p$ is thus obtained from the outcome of the gambles.
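
As an illustration only, the monotonicity properties can be verified by hand in one special case. The logarithmic utility used here is an assumption made for the example, not a case singled out in [3]. Take one gamble to go ($n = 1$), $U(w) = \log w$, and $1/2 \le p < 1$ in the classical case, so that a bet $y$ leaves the gambler with $x + y$ or $x - y$:

```latex
\begin{aligned}
y_1(x) &= \arg\max_{0 \le y \le x}
          \bigl[\, p \log(x + y) + (1 - p) \log(x - y) \,\bigr], \\
0 &= \frac{p}{x + y} - \frac{1 - p}{x - y}
   \quad\Longrightarrow\quad y_1(x) = (2p - 1)\, x, \\
x - y_1(x) &= 2(1 - p)\, x, \qquad x + y_1(x) = 2p\, x .
\end{aligned}
```

Both the amount saved and the amount striven for are increasing in $x$, and the bet $(2p - 1)x$ is nondecreasing in $p$, in agreement with the general statements above.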

3. Optimal Inspection Policies

In a Ph.D. thesis written under Ross's supervision and partially
supported by ARO, Levin [4] considered a wide variety of models for
maintaining systems when inspection is costly. In particular, his
models include the classical assumption that a system changes states in
accordance with a Markov chain, but the true state remains unknown
until a costly inspection is made. Thus one has to balance the cost of
inspection with the possibly high costs of being in a bad state. Levin
has obtained some new and interesting structural results for this
model. Another model considered is an inventory situation in which, for
a given order, the number of items actually received is a random
variable; once again interesting structural results are obtained.

4. The Newsboy Problem

One of the most elementary risk decision problems is the no-carryover
ordering problem known as the newsboy problem. As a preliminary to
studying a Bayesian version of this problem (see Section II), the
following variant was investigated by R. Levin, under Jewell's
supervision [5]:

Given a known demand distribution, $F(x)$, a "newsboy" places an
order $z_1$. If actual sales, $x$, then grow larger than $z_1$, he
can make an instantaneous replenishment in amount $z_2$, and then
if $x > z_1 + z_2$, he can make another instantaneous
replenishment, and so on, up to some given maximal number of
reorders, $N$. Costs include not only variable costs of ordering
and profits per sale, but possibly fixed costs of replenishment.

Naturally, the initial orders depend not only upon the cost
parameter, but upon the number of remaining possible orders
(one does not always reorder even when permitted), as well
as the shape of the residual demand distribution.
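
For orientation, the following sketch treats only the classical baseline with no replenishment at all, not the multi-reorder variant of [5]. It assumes a unit selling price, a unit ordering cost, no salvage value, and exponentially distributed demand; all of these, and the numerical values, are assumptions made for the example. In that baseline the best order satisfies the critical-fractile condition $F(z) = (\text{price} - \text{cost})/\text{price}$, which the code compares with a brute-force search.

```python
import math

def expected_profit(z, price, cost, mean_demand):
    """Expected profit of ordering z when demand is exponential with the given mean."""
    expected_sales = mean_demand * (1.0 - math.exp(-z / mean_demand))  # E[min(D, z)]
    return price * expected_sales - cost * z

def critical_fractile_order(price, cost, mean_demand):
    """Order z with F(z) = (price - cost) / price for exponential demand."""
    return mean_demand * math.log(price / cost)

price, cost, mean_demand = 5.0, 2.0, 100.0      # hypothetical values
z_star = critical_fractile_order(price, cost, mean_demand)
grid = [k * 0.1 for k in range(3001)]           # candidate orders 0.0 .. 300.0
z_grid = max(grid, key=lambda z: expected_profit(z, price, cost, mean_demand))
print(z_star, z_grid)
```

Allowing up to $N$ replenishments changes the problem, since the first order can be smaller when later corrections are possible; that dependence is what the variant above studies.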

Other  Work  Completed 

In November, 1976, W. S. Jewell was invited to present a paper to the
Royal Society (London) Discussion Meeting on "The Use of Operational
Research and System Analysis in Decision Making." Preparation of the
survey paper, "The Analytic Methods of Operations Research" [6], was
partly supported by this contract, and by the Office of Naval Research.

B. Estimation Problems

1. Introduction

Most multistage decision problems under risk require processing
of large amounts of data, either for initial estimates
of the parameters, or to provide continuing updating as new
information is received from prior decisions. Thus, efficient
methods of estimation are important.

Suppose we can observe a random variable $x$ which depends
upon a parameter $\theta$ in a known way, via the likelihood density
$p(x|\theta)$; we assume a prior parameter density, $p(\theta)$, is
available. Prior-to-data predictions about an average $x$ can then
be made through the mixed density $p(x) = E_\theta\, p(x|\theta)$.

Now suppose that, as a result of some decision about experimentation
and sampling, we observe $n$ independent samples of the
random variable, $\mathbf{x} = \{x_t;\ t = 1, 2, \ldots, n\}$. By using
Bayes' theorem, the parameter density posterior-to-data becomes

$$p(\theta|\mathbf{x}) = \frac{\prod_t p(x_t|\theta)\, p(\theta)}{\int \prod_t p(x_t|\phi)\, p(\phi)\, d\phi} . \tag{1}$$

In decision problems, we are not usually so interested in
estimating the true value of the parameter $\theta$ as we are in the
forecast density for $x_{n+1} = y$, the next observation. Thus,
practical control problems require the calculation of

$$p(y|\mathbf{x}) = E_{\theta|\mathbf{x}}\, p(y|\theta) = \int p(y|\theta)\, p(\theta|\mathbf{x})\, d\theta \tag{2}$$

or its moments. For instance, in decision analysis, as decisions
to perform experiments are made, the information from the
experiments is used to recompute expected utilities of one or
more future actions.
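
To make the posterior and forecast computations concrete, here is a small grid-based sketch. The Bernoulli likelihood, the uniform prior, the data, and the helper names are all assumptions chosen for illustration; the integrals are approximated with the trapezoid rule. For this particular choice the forecast probability has a known closed form, Laplace's rule of succession $(s + 1)/(n + 2)$, which serves as a check.

```python
def posterior_on_grid(prior, likelihood, data, grid):
    """Posterior density values at each grid point (Bayes' theorem, normalized)."""
    weights = []
    for theta, p0 in zip(grid, prior):
        w = p0
        for x in data:
            w *= likelihood(x, theta)
        weights.append(w)
    h = grid[1] - grid[0]
    norm = h * (sum(weights) - 0.5 * (weights[0] + weights[-1]))  # trapezoid rule
    return [w / norm for w in weights]

def predictive(y, posterior, likelihood, grid):
    """Forecast density p(y | data): likelihood averaged over the posterior."""
    vals = [likelihood(y, t) * q for t, q in zip(grid, posterior)]
    h = grid[1] - grid[0]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def bernoulli(x, theta):
    """Likelihood of a single 0/1 observation with success probability theta."""
    return theta if x == 1 else 1.0 - theta

grid = [i / 2000 for i in range(2001)]
prior = [1.0] * len(grid)          # uniform prior on [0, 1] (an assumption)
data = [1, 0, 1, 1, 0, 1, 1]       # hypothetical observations: 5 successes in 7
post = posterior_on_grid(prior, bernoulli, data, grid)
p_next = predictive(1, post, bernoulli, grid)
print(p_next)                      # close to (5 + 1) / (7 + 2)
```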

Practically speaking, computations via (1) and (2) are
laborious, and require either large computer capability, or
the use of a few well-known natural conjugate families
of likelihood and prior. This requires knowing (or assuming)
a great deal of structural information about $p(\theta)$ and
$p(x|\theta)$. But, as indicated above, most of this information is
"wasted," especially if we only want moments or an expected
utility.

In the 1920's American actuaries developed a class of
estimators called "credibility formulae" to predict the mean
of the next risk outcome, $x_{n+1}$, given the "experience data,"
$\mathbf{x}$. Using simple heuristic arguments, they derived a linear
forecast formula $f(\mathbf{x})$:

$$E\{x_{n+1}|\mathbf{x}\} \approx f(\mathbf{x}) = (1 - z)\, m + z \left( \frac{1}{n} \sum_{t=1}^{n} x_t \right), \qquad z = \frac{n}{n + N}, \tag{3}$$

where $m$ is the prior mean of $p(y)$ (no data), $\frac{1}{n} \sum x_t$ is the
sample mean of the data, and $N$ is a time constant, chosen originally
in an ad hoc manner. The most interesting features of this
formula are: (1) it uses the data in a simple linear fashion,
mixing it with the prior estimate; (2) it worked extremely well
for 40 years in casualty insurance experience rating.
In the 1950's it was discovered that (3) was in fact the exact
Bayesian result $E\{x_{n+1}|\mathbf{x}\}$ (obtained from (1) and (2)) if the
prior and likelihood were natural conjugate pairs:
Beta-Binomial, Gamma-Poisson, and Normal-Normal. Then in 1967,
it was shown that the credibility forecast (3) is the best
least-squares approximation for arbitrary distributions, if $N$ is
chosen as the ratio of the two components of prior variance, that is,
$N = E\,\mathrm{Var}(x|\theta) \,/\, \mathrm{Var}\,E(x|\theta)$.
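
The exactness of the credibility forecast for a conjugate pair is easy to confirm numerically. The sketch below uses the Gamma-Poisson case with hypothetical prior parameters and data. For a Gamma prior with the given shape and rate on a Poisson mean, the prior mean is shape/rate, and the ratio of expected conditional variance to variance of the conditional mean equals the rate, so $N$ is the rate parameter.

```python
def credibility_forecast(data, prior_mean, time_constant):
    """Linear credibility forecast: (1 - z) * m + z * sample mean, z = n / (n + N)."""
    n = len(data)
    z = n / (n + time_constant)
    return (1.0 - z) * prior_mean + z * (sum(data) / n)

def gamma_poisson_posterior_mean(data, shape, rate):
    """Exact Bayesian forecast mean for Poisson counts with a Gamma(shape, rate) prior."""
    return (shape + sum(data)) / (rate + len(data))

shape, rate = 3.0, 2.0                       # hypothetical prior parameters
data = [0, 2, 1, 4, 1]                       # hypothetical claim counts
m = shape / rate                             # prior mean of the next observation
N = (shape / rate) / (shape / rate ** 2)     # E Var(x|theta) / Var E(x|theta)
forecast = credibility_forecast(data, m, N)
exact = gamma_poisson_posterior_mean(data, shape, rate)
print(forecast, exact)                       # the two agree
```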

2.  Recent  Research 

In  an  extensive  research  effort  supported  by  ARO  since  1973, 
we  have  been  able  to  develop  a variety  of  interesting  and  useful 
extensions  to  credibility  estimators: 

(1)  Proof  that  (3)  is  exact  Bayesian  for  the  Koopman-Pitman- 
Darmois  exponential  family  for  which  x is  a sufficient 
statistic; 

(2) Extensions of least-squares and exact Bayesian theory to
vector-valued random variables;

(3) Estimates of probabilities, $P(y|\mathbf{x})$, for fixed $y$;

(4) Numerous extensions to special model structures, including
hierarchical models;

(5) Direct and inverse Bayesian regression estimators,
with applications to calibration of instruments and measurements
of network flows.

These developments (References 8 through 25) have been reported in detail
in previous proposals and reports.

During the interval from July 1976 to November 1976, related
but independent research was carried out for the Lawrence Livermore
Laboratory on estimation problems associated with material
accountability problems [24] [25].

In August, 1976, W. S. Jewell was invited to Osaka, Japan,
to the annual Management Science Colloquium of Japan. For this
occasion, he prepared a survey paper on the many recent
contributions to credibility theory [23].

In [21] (supported by ARO under a different contract), a
new family of likelihoods, called the proportional-hazard
family, was proposed to facilitate Bayesian computations in
life-testing, where there are incomplete observations of the form
$x > T$. This work has been further extended in [26]. One of
the questions examined in this paper is the following:

An item with random lifetime density $p(x|\theta)$
has "lived" until age $T$. Taking into account the
"learning" about $\theta$ which occurs from the datum
$\{x > T\}$, could we predict the distribution of remaining
life in any different way than by using the prior
mixed density $p(x) = E_\theta\, p(x|\theta)$ in the obvious way?

Not surprisingly, the single datum provides no additional information
about the remaining life through changing estimates of $\theta$.
If, however, there are several items, all with the same parameter,
that are "on test" at age $T$, then there is a changed perception
about the common remaining life distribution. This result has
implications for equipment sinking fund analyses, and other
"learning reserves" models.
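
The single-item statement can be checked in a few lines. The notation $S(t|\theta) = P\{x > t \mid \theta\}$ for the survivor function and $S(t) = E_\theta\, S(t|\theta)$ for its prior mixture is introduced here for the illustration and is not taken from [26]. Updating $\theta$ on the datum $\{x > T\}$ and then forecasting the remaining life gives

```latex
\begin{aligned}
p(\theta \mid x > T) &= \frac{S(T|\theta)\, p(\theta)}{\int S(T|\phi)\, p(\phi)\, d\phi}, \\
P\{x > T + s \mid x > T\}
  &= \int \frac{S(T + s|\theta)}{S(T|\theta)}\; p(\theta \mid x > T)\, d\theta
   = \frac{\int S(T + s|\theta)\, p(\theta)\, d\theta}{\int S(T|\phi)\, p(\phi)\, d\phi}
   = \frac{S(T + s)}{S(T)} ,
\end{aligned}
```

which is exactly the conditional survival probability computed directly from the prior mixed distribution. With several items sharing the same $\theta$, the survival of the other items enters the posterior as additional factors, and this cancellation no longer applies to any one item's remaining life.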